Generalize table storage and scans by aheev · Pull Request #614 · LadybugDB/ladybug

aheev · 2026-06-23T08:26:41Z

add createScanState
get rid of arrowTable::getNumBatches
rm arrow/icebug-disk dynamic_cast checks wherever possible

aheev · 2026-06-23T14:52:49Z

@adsharma could you PTAL?

aheev · 2026-06-23T15:03:08Z

failing test is unrelated:

EXPECT OK BUT GOT ERROR: IO exception: Failed to create directory /home/runner/.lbdb/extension/0.17.0/linux_amd64/ due to: IO exception: Directory /home/runner/.lbdb/extension/0.17.0/linux_amd64 cannot be created. Check if it exists and remove it.

aheev · 2026-06-23T15:26:27Z

This refactor has been carried out based on the assumption that any new external table formats will strictly conform to ColumnarTableBase. Otherwise common functions like getNumScanMorsels and createScanState would need to be moved to NodeTable / RelTable classes

adsharma · 2026-06-23T16:57:30Z

Flaky github infra. Rerunning makes the tests pass.

adsharma · 2026-06-23T17:10:02Z

How about Generalize instead of Standardize in the PR summary? A few comments about interface design. It has evolved organically and this is probably the first time we're being intentional about it.

On `node_group_idx_t` for `getNumScanMorsels`

It's a semantic mismatch, though it "works" numerically. node_group_idx_t is uint64_t (types.h:55), so it compiles and conveniently slots into the existing numCommittedNodeGroups field. But the type name lies about what's being counted:

IceDisk (ice_disk_node_table.cpp:104): returns parquetReader->getNumRowGroups() — here a "morsel" is a row group, which in Kùzu-style systems often maps to a node group. Defensible.
Arrow (arrow_node_table.cpp:160-168): returns Σ ceil(batchLength / scanMorselSize) — these are genuinely morsels, not node groups. Not defensible to call them node_group_idx_t.

The function is named getNumScanMorsels (good) but typed node_group_idx_t (bad) and its result is stored into numCommittedNodeGroups (scan_node_table.cpp:57, a field that is now misnamed for the columnar path — it holds morsel counts, not node groups). So the type and the consuming field both pre-date the generalization and weren't updated to match the new concept.

Recommendation: return a neutral uint64_t (or introduce a small typedef like morsel_count_t), and either rename the shared-state field that receives it or add a clearly-named alias. Keeping node_group_idx_t propagates a name that only fits one of the three formats into the common base.

On passing `memoryManager` to `createScanState()`

No — it's redundant. Table already stores MemoryManager* memoryManager as a protected member (table.h:219), and ColumnarNodeTableBase inherits it. Every implementation just forwards the parameter straight into the scan-state constructor:

// arrow_node_table.cpp:53-57 — memoryManager only ever passed through
return std::make_unique<ArrowNodeTableScanState>(*memoryManager, nodeIDVector, outVectors,
    nodeIDVector->state);

Same in ice_disk_node_table.cpp:43, arrow_rel_table.cpp:41, ice_disk_rel_table.cpp:25. The table object already has the only MemoryManager that makes sense to use, so the parameter adds a caller burden and a chance to pass the wrong one.

Recommendation: drop the memoryManager parameter; implementations use this->memoryManager. This also removes the inconsistency where the rel signature takes an extra outChunkState the node signature doesn't (see below).

Other observations

1. The operators still type-switch on the base class instead of dispatching virtually. This is the biggest interface smell. scan_node_table.cpp does dynamic_cast<ColumnarNodeTableBase*>(table) in three separate places (createNodeTableScanState :20, initialize :56, nextMorsel :75), and scan_rel_table.cpp/scan_multi_rel_tables.cpp repeat the pattern. The PR built a base class but then kept the external branching. Contrast with initializeScanCoordination, which is a virtual on the base and is called unconditionally (scan_node_table.cpp:54) — that's the pattern done right. createScanState and getNumScanMorsels should follow it: add them as virtuals on NodeTable/Table with default implementations for native tables, eliminating the casts. Right now the design is half-polymorphic.

2. Returning unique_ptr<TableScanState> and immediately downcasting at the call site. Both rel callers downcast the returned base pointer:

// scan_rel_table.cpp:95-96, scan_multi_rel_tables.cpp:89-90
scanState = std::unique_ptr<RelTableScanState>(
    dynamic_cast<RelTableScanState*>(tableScanState.release()));

This signals the return type is too abstract. ColumnarRelTableBase::createScanState should return unique_ptr<RelTableScanState>; ColumnarNodeTableBase::createScanState should return unique_ptr<NodeTableScanState> (or ColumnarNodeTableScanState). The dynamic_cast here is purely working around an over-broad return type.

3. Asymmetric signatures between the node and rel base. Node: createScanState(nodeIDVector, outVectors, mm). Rel: createScanState(nodeIDVector, outVectors, mm, outChunkState). The node version secretly hardcodes nodeIDVector->state as the outChunkState; the rel version needs nbrNodeIDVector->state. If you take outChunkState as an explicit param on both (or derive it consistently), the two bases mirror each other — desirable for a "common columnar base." Right now a reader has to notice the hidden assumption.

4. The two base classes aren't symmetric in their interface. ColumnarNodeTableBase exposes getNumScanMorsels + createScanState; ColumnarRelTableBase exposes only createScanState. That's defensible (rel scans are bound-node-driven, not full-table morsel scans), but it's worth a one-line comment on the rel base explaining why getNumScanMorsels is absent, so a future lance/parquet author doesn't assume the omission is a bug.

5. getNumScanMorsels(transaction) hides wildly different costs. Arrow's is an in-memory arithmetic loop; IceDisk's opens the parquet file and constructs a throwaway ParquetReader every call (ice_disk_node_table.cpp:103) — and getTotalRowCount does it again (:272). The interface looks like a cheap accessor but is I/O-bound for IceDisk. Since it's called during initializeScanCoordination and again from the shared-state init, consider caching the row-group count once (the reader is already constructed per scan-state anyway). Not an interface-shape issue per se, but the const-accessor framing sets the wrong expectation.

6. ColumnarNodeTableScanState takes MemoryManager& mm marked [[maybe_unused]] (columnar_node_table_base.h:21). The [[maybe_unused]] is the tell that the base constructor shouldn't take it — either store it (the subclasses that need it can reach it) or drop it from the base signature. As-is it's dead weight that every subclass forwards.

7. createScanState is const but builds state holding mutable resources (e.g. IceDiskNodeTableScanState owns a ParquetReader that performs writes/caching). Fine in practice, but slightly surprising for a const method; worth a comment if intentional.

Summary of the highest-value changes

Drop memoryManager from createScanState (redundant with Table::memoryManager).
Replace node_group_idx_t with a neutral count type on getNumScanMorsels, and fix the numCommittedNodeGroups field naming on the columnar path.
Push createScanState/getNumScanMorsels down as virtuals with native defaults so the scan operators stop dynamic_cast-ing — that's what completes the generalization the PR is aiming for.
Narrow the return types (RelTableScanState / NodeTableScanState) to kill the call-site downcasts.

aheev · 2026-06-24T02:15:10Z

This refactor has been carried out based on the assumption that any new external table formats will strictly conform to ColumnarTableBase. Otherwise common functions like getNumScanMorsels and createScanState would need to be moved to NodeTable / RelTable classes

The operators still type-switch on the base class instead of dispatching virtually was my concern here. It will break the ABI. Do you still want me to proceed?

aheev · 2026-06-24T04:05:48Z

On node_group_idx_t for getNumScanMorsels

It's a semantic mismatch, though it "works" numerically. node_group_idx_t is uint64_t (types.h:55), so it compiles and conveniently slots into the existing numCommittedNodeGroups field. But the type name lies about what's being counted:

IceDisk (ice_disk_node_table.cpp:104): returns parquetReader->getNumRowGroups() — here a "morsel" is a row group, which in Kùzu-style systems often maps to a node group. Defensible.

Arrow (arrow_node_table.cpp:160-168): returns Σ ceil(batchLength / scanMorselSize) — these are genuinely morsels, not node groups. Not defensible to call them node_group_idx_t.

The function is named getNumScanMorsels (good) but typed node_group_idx_t (bad) and its result is stored into numCommittedNodeGroups (scan_node_table.cpp:57, a field that is now misnamed for the columnar path — it holds morsel counts, not node groups). So the type and the consuming field both pre-date the generalization and weren't updated to match the new concept.

Recommendation: return a neutral uint64_t (or introduce a small typedef like morsel_count_t), and either rename the shared-state field that receives it or add a clearly-named alias. Keeping node_group_idx_t propagates a name that only fits one of the three formats into the common base.

ColumnarTables use NodeTable's functions which expect node_group_idx_t

Other observations

1. The operators still type-switch on the base class instead of dispatching virtually. This is the biggest interface smell. scan_node_table.cpp does dynamic_cast<ColumnarNodeTableBase*>(table) in three separate places (createNodeTableScanState :20, initialize :56, nextMorsel :75), and scan_rel_table.cpp/scan_multi_rel_tables.cpp repeat the pattern. The PR built a base class but then kept the external branching. Contrast with initializeScanCoordination, which is a virtual on the base and is called unconditionally (scan_node_table.cpp:54) — that's the pattern done right. createScanState and getNumScanMorsels should follow it: add them as virtuals on NodeTable/Table with default implementations for native tables, eliminating the casts. Right now the design is half-polymorphic.

#614 (comment)

2. Returning unique_ptr<TableScanState> and immediately downcasting at the call site. Both rel callers downcast the returned base pointer:
// scan_rel_table.cpp:95-96, scan_multi_rel_tables.cpp:89-90
scanState = std::unique_ptr<RelTableScanState>(
    dynamic_cast<RelTableScanState*>(tableScanState.release()));
This signals the return type is too abstract. ColumnarRelTableBase::createScanState should return unique_ptr<RelTableScanState>; ColumnarNodeTableBase::createScanState should return unique_ptr<NodeTableScanState> (or ColumnarNodeTableScanState). The dynamic_cast here is purely working around an over-broad return type.

The motivation behind returning a TableScanState is scanState is used in functions such as scanInternal, initScanState, etc which expect a TableScanState

3. Asymmetric signatures between the node and rel base. Node: createScanState(nodeIDVector, outVectors, mm). Rel: createScanState(nodeIDVector, outVectors, mm, outChunkState). The node version secretly hardcodes nodeIDVector->state as the outChunkState; the rel version needs nbrNodeIDVector->state. If you take outChunkState as an explicit param on both (or derive it consistently), the two bases mirror each other — desirable for a "common columnar base." Right now a reader has to notice the hidden assumption.

Node and rel tables are meant to be different. Moreover we don't have a common ColumnarTableBase

4. The two base classes aren't symmetric in their interface. ColumnarNodeTableBase exposes getNumScanMorsels + createScanState; ColumnarRelTableBase exposes only createScanState. That's defensible (rel scans are bound-node-driven, not full-table morsel scans), but it's worth a one-line comment on the rel base explaining why getNumScanMorsels is absent, so a future lance/parquet author doesn't assume the omission is a bug.

could be done as part of #614 (comment)

7. createScanState is const but builds state holding mutable resources (e.g. IceDiskNodeTableScanState owns a ParquetReader that performs writes/caching). Fine in practice, but slightly surprising for a const method; worth a comment if intentional.

It doesn't modify members of the class 🤔

adsharma · 2026-06-24T16:12:25Z

The larger question is:

Table  
  └── NodeTable  
        └── ColumnarNodeTableBase  
              ├── IceDiskNodeTable  
              └── ArrowNodeTable

vs

Table  
  └── ColumnarNodeTableBase (base)  
        ├── NodeTable (specializes for node groups)  
                   └── IceDiskNodeTable
                               └── Parquet Based
                                └── Foreign (subclassed in an extension)
        └── ArrowNodeTable (uses Arrow batches)

We arrived at the current design with the idea that NodeTable contains a lot of well tested code. Let's not break anything and do the least disruptive thing. That's the reason for some of the interface smell we have.

But you bring up a good point about breaking ABI. Is it the extension ABI you're talking about or something else?

aheev · 2026-06-25T14:15:12Z

The larger question is:
Table  
  └── NodeTable  
        └── ColumnarNodeTableBase  
              ├── IceDiskNodeTable  
              └── ArrowNodeTable  
vs
Table  
  └── ColumnarNodeTableBase (base)  
        ├── NodeTable (specializes for node groups)  
                   └── IceDiskNodeTable
                               └── Parquet Based
                                └── Foreign (subclassed in an extension)
        └── ArrowNodeTable (uses Arrow batches)  
We arrived at the current design with the idea that NodeTable contains a lot of well tested code. Let's not break anything and do the least disruptive thing. That's the reason for some of the interface smell we have.

But you bring up a good point about breaking ABI. Is it the extension ABI you're talking about or something else?

I will get back to it by tmrw

aheev · 2026-06-26T09:30:27Z

The larger question is:

Table  
  └── NodeTable  
        └── ColumnarNodeTableBase  
              ├── IceDiskNodeTable  
              └── ArrowNodeTable

vs

Table  
  └── ColumnarNodeTableBase (base)  
        ├── NodeTable (specializes for node groups)  
                   └── IceDiskNodeTable
                               └── Parquet Based
                                └── Foreign (subclassed in an extension)
        └── ArrowNodeTable (uses Arrow batches)

Ideally it should be the 2nd one. But before designing top interfaces i.e., ColumnarNodeTableBase and ColumnarRelTableBase, we should make it generic enough and closed (for changes) by weighing what should be part ofthem: read functions, write functions, read scanStates, write scanStates, index functions

PS: Not sure why ArrowNodeTable is separated out. It plays very well with other columnar formats

We arrived at the current design with the idea that NodeTable contains a lot of well tested code. Let's not break anything and do the least disruptive thing. That's the reason for some of the interface smell we have.

100% agree. The idea is to add virtual empty functions and override them only in non-native tables

But you bring up a good point about breaking ABI. Is it the extension ABI you're talking about or something else?

if the extensions are the only ones consuming the ladybug ABI, then yes. Since we build and release the core and extensions together, I believe the rist is low 🤔 . The only issue is loading extensions built against old ladybug code

adsharma · 2026-06-26T16:24:31Z

PS: Not sure why ArrowNodeTable is separated out. It plays very well with other columnar formats

Arrow is special in that it doesn't have the concept of a Node/Row group and we're externally imposing it. It's a memory based format that provides random access vs other formats being block based.

The only issue is loading extensions built against old ladybug code

This is something we don't support today, but it's a good goal to shoot for. There is a period after the release (such as 0.17.1) and the next release (0.18.x) when you can't use published extensions if ABI breaking changes went in.

Sounds like we have consensus that the second design is preferable, but the cost of doing it now may be high?

Also Lance may be different from other parquet like formats in aspiring to be a single file database (with indexes, incremental updates etc) as opposed to being a generic/simpler read-only format like parquet. In that sense it might be closer to ladybug native tables, not parquet.

aheev · 2026-06-27T02:10:07Z

Sounds like we have consensus that the second design is preferable, but the cost of doing it now may be high?

Yeah. Refactor would be huge, but we will be risking just the Table casts. Agree to doing it post release if that's where you're going

Also Lance may be different from other parquet like formats in aspiring to be a single file database (with indexes, incremental updates etc) as opposed to being a generic/simpler read-only format like parquet. In that sense it might be closer to ladybug native tables, not parquet.

Yeah. However, right now every table has its own folder containing both its data and metadata. Not sure about the future. That's where lance catalog catalog can be used

Should we consult lance folks?

adsharma · 2026-06-27T15:47:23Z

I'm thinking all of this is post 0.18.x release.

Lance catalog: we should distinguish the two use cases:

Lance tables attached directly (could work the same way as duckdb catalog attached to ladybugdb)
Lance tables coming from icedisk - we should treat Lance like parquet and keep the catalog in schema.cypher or metadata.lbdb)

Would be great to get review from Lance folks if possible.

aheev · 2026-06-28T03:26:01Z

I'm thinking all of this is post 0.18.x release.

Lance catalog: we should distinguish the two use cases:

Lance tables attached directly (could work the same way as duckdb catalog attached to ladybugdb)

Lance tables coming from icedisk - we should treat Lance like parquet and keep the catalog in schema.cypher or metadata.lbdb)

Would be great to get review from Lance folks if possible.

I don't think it's possible to get a review from them. They hardly respond on lance repos 🙃 . I can ask them about catalog vs non-catalog

adsharma · 2026-06-28T05:42:44Z

deepwiki says you can list_tables(). But not clear if Lance's C bindings support it. May be rust only.

aheev · 2026-06-30T16:27:39Z

No reply from lance maintainers. I think it's better follow lance-graph: support dataset URI and namespaces

aheev added 4 commits June 22, 2026 21:54

add TableStorageFormat enum

4358231

standardise scans

a865aba

fix tests

2699bba

rm TableStorageFormat

e15d5e7

aheev marked this pull request as ready for review June 23, 2026 14:45

aheev changed the title ~~standardize table storage and scans~~ Generalize table storage and scans Jun 24, 2026

optimising scan states

9b536cb

rm memory_manager param

c8885fb

aheev force-pushed the std-storage branch from 35de490 to c8885fb Compare June 24, 2026 04:30

aheev added 2 commits June 24, 2026 10:15

revert nextMorsel

13a4513

fix scan_multi_rel_tables bug

30451cb

Uh oh!

Conversation

aheev commented Jun 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

aheev commented Jun 23, 2026

Uh oh!

aheev commented Jun 23, 2026

Uh oh!

aheev commented Jun 23, 2026

Uh oh!

adsharma commented Jun 23, 2026

Uh oh!

adsharma commented Jun 23, 2026

On node_group_idx_t for getNumScanMorsels

On passing memoryManager to createScanState()

Other observations

Summary of the highest-value changes

Uh oh!

aheev commented Jun 24, 2026

Uh oh!

aheev commented Jun 24, 2026

On node_group_idx_t for getNumScanMorsels

Other observations

Uh oh!

adsharma commented Jun 24, 2026

Uh oh!

aheev commented Jun 25, 2026

Uh oh!

aheev commented Jun 26, 2026

Uh oh!

adsharma commented Jun 26, 2026

Uh oh!

aheev commented Jun 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

adsharma commented Jun 27, 2026

Uh oh!

aheev commented Jun 28, 2026

Uh oh!

adsharma commented Jun 28, 2026

Uh oh!

aheev commented Jun 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

aheev commented Jun 23, 2026 •

edited

Loading

On `node_group_idx_t` for `getNumScanMorsels`

On passing `memoryManager` to `createScanState()`

On `node_group_idx_t` for `getNumScanMorsels`

aheev commented Jun 27, 2026 •

edited

Loading